Search results for "Historical document"

Showing 6 of 6 documents

An Efficient Cooperative Smearing Technique for Degraded Historical Documents Images Segmentation

2020

Segmentation is one of the critical steps in historical document image analysis systems; it determines the quality of the search, understanding, recognition and interpretation processes. It isolates the objects to be considered and separates the regions of interest (paragraphs, lines, words and characters) from other entities (figures, graphs, tables, etc.). This stage follows thresholding, which aims to improve the quality of the document and to separate its foreground from its background, and skew detection and correction, which straightens the document. Here, a hybrid method is proposed in order to locate words and characters in both handwritten and printed docu…

050101 languages & linguistics; Computer science; 02 engineering and technology; Image (mathematics); Interpretation (model theory); [INFO.INFO-AI] Computer Science [cs]/Artificial Intelligence [cs.AI]; 0202 electrical engineering electronic engineering information engineering; 0501 psychology and cognitive sciences; Segmentation; Quality (business); ComputingMilieux_MISCELLANEOUS; Smearing technique; 05 social sciences; Pattern recognition; Image segmentation; Hybrid approach; Computer Graphics and Computer-Aided Design; Computer Science Applications; 020201 artificial intelligence & image processing; Computer Vision and Pattern Recognition; Artificial intelligence; Historical document

ICDAR 2021 Competition on Historical Document Classification

2021

This competition investigated the performance of historical document classification. The analysis of historical documents is a difficult challenge commonly solved by trained humanists. We provided three classification tasks, which can be solved individually or jointly: font group/script type, location, and date. The document images were provided by several institutions and are taken from handwritten and printed books as well as from charters. In contrast to previous competitions, all participants relied on deep-learning-based approaches. Nevertheless, performance varied widely across the submitted systems. The easiest task seemed to be font grou…

Historical document images; Computer science; Document classification; Deep learning; Contrast (statistics); Variety (linguistics); Task (project management); Competition (economics); [INFO.INFO-TS] Computer Science [cs]/Signal and Image Processing; Document analysis; Font; ComputingMethodologies_DOCUMENTANDTEXTPROCESSING; Dating; Artificial intelligence; [SHS.HIST] Humanities and Social Sciences/History; Natural language processing; Historical document

Reducing the Human Effort in Text Line Segmentation for Historical Documents

2021

Labeling the layout of historical documents to prepare training data for machine learning techniques is an arduous task that requires great human effort. A draft of the layout can be obtained with a document layout analysis (DLA) system and then corrected by the user with less effort than labeling from scratch. In this paper we study an iterative process in which the user supervises and corrects the draft only for the pages automatically selected by the DLA system, with the aim of reducing the required human effort. The results show that similar DLA quality can be achieved while reducing the number of pages that the user has to annotate and that the accumulated h…

Iterative and incremental development; Training set; Information retrieval; Computer science; Quality (business); Segmentation; Line (text file); Document layout analysis; Historical document; Task (project management)

Writer identification for historical handwritten documents using a single feature extraction method

2020

With the growth of artificial intelligence techniques, the problem of writer identification from historical documents has gained increased interest. It consists of determining the identity of the writers of these documents. This paper introduces our baseline system for writer identification, tested on a large dataset of Latin historical manuscripts used in the ICDAR 2019 competition. The proposed system yielded the best results using Scale Invariant Feature Transform (SIFT) as a single feature extraction method, without any preprocessing stage. The system was compared against those of four teams who participated in the competition with different feature extraction methods: SRS-LBP, SI…

Writer identification; Computer science; Feature extraction; historical documents; Scale-invariant feature transform; 020207 software engineering; Pattern recognition; 02 engineering and technology; artificial intelligence; Convolutional neural network; Support vector machine; Identification (information); sift descriptors; 0202 electrical engineering electronic engineering information engineering; Identity (object-oriented programming); Unsupervised learning; 020201 artificial intelligence & image processing; [INFO] Computer Science [cs]

A Robust Multi Stage Technique for Image Binarization of Degraded Historical Documents

2017

Document image binarization is a central problem in many document analysis systems. Indeed, it represents one of the basic challenges, especially in the case of historical document analysis. In this paper, we propose a novel, robust multi-stage framework that combines different existing document image thresholding methods in order to obtain a better binarization result. The CLAHE technique is used to significantly enhance contrast in some poor-quality images. The proposed method then uses a hybrid algorithm to partition the image into foreground and background. A special procedure is finally applied to remove small noise and correct character morphology. Experime…

adaptive thresholding; Computer science; Historical document image analysis; [SPI] Engineering Sciences [physics]; ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION; 02 engineering and technology; hybrid algorithm; 01 natural sciences; Grayscale; Electronic mail; 010309 optics; Histogram; 0103 physical sciences; 0202 electrical engineering electronic engineering information engineering; Noise measurement; Pattern recognition; Image segmentation; global thresholding; Thresholding; [SPI.TRON] Engineering Sciences [physics]/Electronics; ComputingMethodologies_DOCUMENTANDTEXTPROCESSING; contrast enhancement; 020201 artificial intelligence & image processing; Algorithm design; Adaptive histogram equalization; Artificial intelligence

Towards semantic modelling of cultural historical data.

2010

In this paper, a practical method is presented for creating documentation of cultural historical targets using an event-centric core ontology. By using semantic documentation templates and an XML-based query language, a domain-specific documentation model can be created, and flexible user interfaces for accessing and editing the documentation can be built easily. Keywords: ontologies, cultural historical documentation, information retrieval. Peer reviewed.

ontologies; cultural historical documentation; ComputingMethodologies_DOCUMENTANDTEXTPROCESSING; information seeking
